feat(search): EXP-12 — log-spaced recency bins for TR and MR #6

Draft
moralespanitz wants to merge 1 commit into main from
feature/exp-12-log-spaced-recency-bins

Why

BEAM Sprint 2 dry-run (iter 7 v3) is leaving temporal-reasoning (TR) and multi-session-reasoning (MR/MSR) probes on the table: TR 1/2, MSR 0/2 in the latest run, well below the Honcho TR=0.644 / MR=0.631 reference. Raw created_at timestamps don't match query phrasing like "recently" or "last week"; the retrieval path needs a coarser, scale-invariant signal it can match against.

Log-spaced recency bins are the cheapest first cut at this — Blueprint M5 calls out "AtomicMemory has naive timestamps … No Laplace bank, no drift vector, no scale-invariant representation"; log bins approximate Laplace at ~5% of the cost.

What

EXP-12 from phase2-implementation-plans-2026-04-29.md. Two pieces:

  • Ingest-time tag: every fact written by storeProjection carries metadata.recency_bin (a debug breadcrumb; retrieval recomputes from created_at).
  • Retrieval-time boost: new applyRecencyBinBoost stage in the search pipeline runs after applyCurrentStateRanking and before applyConcisenessPenalty. When the query has a recognizable recency marker, the stage adds recencyBinBoostWeight × computeBinAffinity(queryBin, factBin) to each candidate's score (1.0 for exact bin match, 0.5 for an adjacent bin, 0 otherwise) and re-sorts. Skipped entirely when current-state-ranking already triggered, to avoid double-counting two recency-flavored signals.
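
A minimal sketch of how the retrieval-time stage described above could work. Only `applyRecencyBinBoost`, the two config keys, and the 1.0 / 0.5 / 0 affinity values come from this PR; the `ScoredFact` shape, the options object, the helper names, and representing bins as array indices are assumptions made for illustration and may not match recency-bin-ranking.ts.

```typescript
// Upper bounds in seconds for bins 1m, 10m, 1h, 10h, 1d, 10d, 100d.
// Index 7 (one past the end) stands for "older".
const BOOST_BIN_BOUNDS_S = [60, 600, 3600, 36000, 86400, 864000, 8640000];

// Hypothetical candidate shape; the real pipeline type is not shown in this PR.
interface ScoredFact {
  id: string;
  score: number;
  createdAt: Date;
}

// Hypothetical options bag for the stage.
interface RecencyBoostOptions {
  enabled: boolean; // recencyBinBoostEnabled
  weight: number; // recencyBinBoostWeight
  currentStateRankingTriggered: boolean; // set by the earlier stage
}

// Recompute the bin from created_at rather than trusting persisted metadata.
function binIndexFromCreatedAt(createdAt: Date, now: Date): number {
  const ageS = Math.max(0, (now.getTime() - createdAt.getTime()) / 1000);
  for (let i = 0; i < BOOST_BIN_BOUNDS_S.length; i++) {
    if (ageS <= BOOST_BIN_BOUNDS_S[i]) return i;
  }
  return BOOST_BIN_BOUNDS_S.length;
}

// 1.0 for the same bin, 0.5 for a neighbouring bin, 0 otherwise.
function affinityByIndex(queryBin: number, factBin: number): number {
  const distance = Math.abs(queryBin - factBin);
  if (distance === 0) return 1.0;
  if (distance === 1) return 0.5;
  return 0;
}

function applyRecencyBinBoost(
  candidates: ScoredFact[],
  queryBin: number | null,
  now: Date,
  options: RecencyBoostOptions
): ScoredFact[] {
  // No-op when disabled, weightless, or when current-state-ranking already fired.
  if (!options.enabled || options.weight === 0 || options.currentStateRankingTriggered) {
    return candidates;
  }
  // No-op when the query carries no recognizable recency marker.
  if (queryBin === null) return candidates;
  const targetBin: number = queryBin;

  const boosted = candidates.map((fact) => ({
    ...fact,
    score:
      fact.score +
      options.weight * affinityByIndex(targetBin, binIndexFromCreatedAt(fact.createdAt, now)),
  }));
  // Re-sort by boosted score, highest first; the input array is left untouched.
  return boosted.sort((a, b) => b.score - a.score);
}
```

With a weight of 0.10, an exact bin match adds 0.10 and an adjacent bin adds 0.05, so the boost can reorder near-ties but leaves candidates separated by larger score gaps in their original order.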

Bin schema (and why log-spaced)

1m    →   ≤ 60 s
10m   →   ≤ 10 min
1h    →   ≤ 1 h
10h   →   ≤ 10 h
1d    →   ≤ 24 h
10d   →   ≤ 10 days
100d  →   ≤ 100 days
older →   anything older

Roughly log-spaced: most rungs are 10× the previous one, with two smaller steps where the unit changes (10m → 1h is 6×, 10h → 1d is 2.4×). Adjacent bins are close in human terms ("yesterday" vs "last week" feel related), so we award half-credit for adjacency. BEAM conversations span simulated minutes to months; a fixed-stride scheme would either lose minutes-scale resolution or balloon the bin count.
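
The schema above can be sketched as a small lookup. This is an illustration under stated assumptions, not the code in temporal-fingerprint.ts: `assignRecencyBin` is a hypothetical name, upper bounds are treated as inclusive (as the ≤ signs suggest), and future-dated facts are assumed to clamp to age 0. `computeBinAffinity` is named in this PR, but its body here is a guess based on the exact / adjacent / non-adjacent description.

```typescript
// Bin labels ordered from most recent to oldest, as in the schema above.
const RECENCY_BINS = ["1m", "10m", "1h", "10h", "1d", "10d", "100d", "older"] as const;
type RecencyBin = (typeof RECENCY_BINS)[number];

const MINUTE_MS = 60 * 1000;
const HOUR_MS = 60 * MINUTE_MS;
const DAY_MS = 24 * HOUR_MS;

// Inclusive upper bound for every bin except the catch-all "older".
const BIN_UPPER_BOUNDS_MS: ReadonlyArray<[RecencyBin, number]> = [
  ["1m", MINUTE_MS],
  ["10m", 10 * MINUTE_MS],
  ["1h", HOUR_MS],
  ["10h", 10 * HOUR_MS],
  ["1d", DAY_MS],
  ["10d", 10 * DAY_MS],
  ["100d", 100 * DAY_MS],
];

// Hypothetical helper: map a fact's age to its bin label.
function assignRecencyBin(createdAt: Date, now: Date): RecencyBin {
  // Assumption: a future-dated created_at clamps to age 0 and lands in "1m".
  const ageMs = Math.max(0, now.getTime() - createdAt.getTime());
  for (const [bin, upperMs] of BIN_UPPER_BOUNDS_MS) {
    if (ageMs <= upperMs) return bin;
  }
  return "older";
}

// 1.0 for an exact match, 0.5 for an adjacent bin, 0 otherwise.
function computeBinAffinity(queryBin: RecencyBin, factBin: RecencyBin): number {
  const distance = Math.abs(RECENCY_BINS.indexOf(queryBin) - RECENCY_BINS.indexOf(factBin));
  if (distance === 0) return 1.0;
  if (distance === 1) return 0.5;
  return 0;
}
```

Keeping the labels in one ordered array means adjacency is just an index difference of 1, so adding or removing a rung only requires editing the two tables.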

Config-override copy-paste (defaults-off)

{
  "config_override": {
    "recencyBinBoostEnabled": true,
    "recencyBinBoostWeight": 0.10
  }
}

Both keys are in INTERNAL_POLICY_CONFIG_FIELDS. Env-var equivalents are RECENCY_BIN_BOOST_ENABLED and RECENCY_BIN_BOOST_WEIGHT.
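
A hedged sketch of how the two keys might be resolved from the environment and a per-request override. The key names, env-var names, and defaults come from this PR; `resolveRecencyBinConfig`, the parsing rules, and the precedence (override over env over default) are assumptions for illustration only.

```typescript
interface RecencyBinConfig {
  recencyBinBoostEnabled: boolean;
  recencyBinBoostWeight: number;
}

// Defaults as listed in this PR: the feature is off unless explicitly enabled.
const RECENCY_BIN_DEFAULTS: RecencyBinConfig = {
  recencyBinBoostEnabled: false,
  recencyBinBoostWeight: 0.1,
};

// Hypothetical resolver. Assumed precedence: config_override > env var > default.
function resolveRecencyBinConfig(
  env: Record<string, string | undefined>,
  override: Partial<RecencyBinConfig> = {}
): RecencyBinConfig {
  const rawEnabled = env["RECENCY_BIN_BOOST_ENABLED"];
  const rawWeight = env["RECENCY_BIN_BOOST_WEIGHT"];

  // Assumption: only the literal string "true" (any case) enables the flag.
  const envEnabled =
    rawEnabled === undefined
      ? RECENCY_BIN_DEFAULTS.recencyBinBoostEnabled
      : rawEnabled.trim().toLowerCase() === "true";

  // Fall back to the default when the weight is missing, empty, or not a number.
  const parsedWeight =
    rawWeight === undefined || rawWeight.trim() === "" ? NaN : Number(rawWeight);
  const envWeight = isFinite(parsedWeight)
    ? parsedWeight
    : RECENCY_BIN_DEFAULTS.recencyBinBoostWeight;

  return {
    recencyBinBoostEnabled:
      override.recencyBinBoostEnabled !== undefined ? override.recencyBinBoostEnabled : envEnabled,
    recencyBinBoostWeight:
      override.recencyBinBoostWeight !== undefined ? override.recencyBinBoostWeight : envWeight,
  };
}
```

In a Node process this would be called as `resolveRecencyBinConfig(process.env, request.config_override)`, where `request` is a placeholder for whatever object carries the override.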

How it composes with EXP-05

Independent.

  • EXP-05 touches extraction-enrichment.ts, memory-service-types.ts, memory-storage.ts:96 (instruction-tag), and search-pipeline.ts:755-761 (instruction boost).
  • EXP-12 touches temporal-fingerprint.ts, temporal-query-expansion.ts, recency-bin-ranking.ts (new), memory-storage.ts:96 (recency-bin breadcrumb), and the same protection-stage block in search-pipeline.ts.
  • Both share that protection-stage block but write distinct stages: instruction-boost goes between current-state-ranking and conciseness-penalty; recency-bin-boost goes immediately after current-state-ranking.
  • The RESERVED_METADATA_KEYS set is the only collision point: EXP-05 added fact_role, EXP-12 adds recency_bin. Both keys are pre-allowlisted for the metadata drift guard.

Test plan

  • src/services/__tests__/temporal-fingerprint.test.ts — bin assignment table tests (boundary cases at 1m, 10m, 1h, 10h, 1d, 10d, 100d, plus future-dated clamp) and computeBinAffinity (exact / adjacent / non-adjacent).
  • src/services/__tests__/recency-bin-ranking.test.ts — boost applied when enabled, no-op on weight=0, no-op on unrecognizable queries, short-circuit when current-state-ranking already triggered, weight applied as configured, recompute-from-created_at (ignores stale persisted hints).
  • src/__tests__/reserved-metadata-keys.test.ts (drift guard) — passes with 'recency_bin' added.
  • src/__tests__/config-partition.test.ts — passes with new keys in INTERNAL_POLICY_CONFIG_FIELDS.
  • npx tsc --noEmit — clean.
  • BEAM TR / MR / MSR slice with recencyBinBoostEnabled: true — pending dry-run iter on feature/exp-12-log-spaced-recency-bins.

Rollback

recencyBinBoostEnabled: false (default). The persisted metadata.recency_bin field is harmless metadata; no migration is needed if the flag is disabled or the feature is reverted.

Adds a coarse log-spaced recency tag at ingest time and a matching
boost stage at retrieval time, behind the new feature flag
`recencyBinBoostEnabled`.

Bins: 1m, 10m, 1h, 10h, 1d, 10d, 100d, older.

At ingest, every fact gets metadata.recency_bin set from now - created_at.
At retrieval, inferQueryBin() reads the query text (e.g. "yesterday",
"last week", "right now") and returns the bin that matches; the boost
stage recomputes each fact's bin from created_at and adds
recencyBinBoostWeight × computeBinAffinity(queryBin, factBin) to its
score (1.0 for an exact bin match, 0.5 for an adjacent bin, 0
otherwise). The stage runs after current-state-ranking and
short-circuits when current-state-ranking already fired (to avoid
double-counting).
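
A minimal sketch of what inferQueryBin() might look like. The function name and the three example phrases come from this commit message; the phrase-to-bin mapping, the regex matching, and returning null for unrecognized queries are assumptions, and the real marker list in temporal-query-expansion.ts is presumably larger.

```typescript
type QueryBin = "1m" | "10m" | "1h" | "10h" | "1d" | "10d" | "100d" | "older";

// Hypothetical marker table: which bin each phrase maps to is a guess.
// Patterns carry no "g" flag, so RegExp.test() keeps no state between calls.
const QUERY_RECENCY_MARKERS: ReadonlyArray<[RegExp, QueryBin]> = [
  [/\bright now\b/i, "1m"],
  [/\byesterday\b/i, "1d"],
  [/\blast week\b/i, "10d"],
];

// Returns the bin for the first marker found, or null when the query
// has no recognizable recency phrase (in which case the boost is skipped).
function inferQueryBin(query: string): QueryBin | null {
  for (const [pattern, bin] of QUERY_RECENCY_MARKERS) {
    if (pattern.test(query)) return bin;
  }
  return null;
}
```

Because adjacent bins earn half credit, a slightly off mapping (for example, "yesterday" landing one rung away from the fact's bin) still produces a partial boost rather than none.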

Targets BEAM TR and MR. Sprint 2 dry-run on iter 7 v3 had TR 1/2 and
MSR 0/2 — log-spaced bins are the cheapest first cut at temporal
disambiguation.

New config keys (defaults-off):
- recencyBinBoostEnabled: false
- recencyBinBoostWeight: 0.10

Both are config-override-allowlisted for per-request A/B via the BEAM
adapter. RESERVED_METADATA_KEYS extended with 'recency_bin' for the
metadata drift guard.

Behind feature flags. Defaults preserve current behavior.
moralespanitz requested a review from ethanj as a code owner on April 29, 2026, 20:32
moralespanitz marked this pull request as draft on April 30, 2026, 05:19